Entity based Q&A Retrieval
Abstract
Bridging the lexical gap between the user’s question and the question-answer pairs in Q&A archives has been a major challenge for Q&A retrieval. State-of-the-art approaches address this issue by implicitly expanding queries with additional words using statistical translation models. While useful, the effectiveness of these models depends heavily on the availability of a high-quality corpus; without one, they suffer from noise. Moreover, these models perform word-based expansion in a context-agnostic manner, producing translations that may be mixed and fairly general, which degrades retrieval performance. In this work, we address these issues by extending the lexical word-based translation model to incorporate semantic concepts (entities). We explore strategies for learning the translation probabilities between words and concepts using the Q&A archives and a popular entity catalog. Experiments on large-scale real-world data show that the proposed techniques are promising.
Similar resources
Question Answering with LCC's CHAUCER-2 at TREC 2007
In TREC 2007, Language Computer Corporation explored how a new, semantically-rich framework for information retrieval could be used to boost the overall performance of the answer extraction and answer selection components featured in its CHAUCER-2 automatic question-answering (Q/A) system. By replacing the traditional keyword-based retrieval system used in (Hickl et al. 2006c) with a new indexi...
Category-Based Query Modeling for Entity Search
• Entity ranking: the topic consists of a keyword query (Q) and target categories (C)
• List completion: the topic also specifies example entities (E)
• Users often look for specific entities instead of documents mentioning them
• Entities are represented by their Wikipedia page
• Introduces a general probabilistic framework for entity retrieval
• Focus on the use of category information in a theoret...
Bangor at TREC 2003: Q&A and Genomics Tracks
We present the QITEKAT Question-Answering system based on the conceptual theory of Knowing About Knowledge, which adopts an agent-based approach to extract information from suitable corpora. The components of the QITEKAT system entered by the School of Informatics, University of Wales, Bangor, in the 2003 Text Retrieval Conference are described in detail. We describe PPM compression techniques ...
A Comparative Study of Word Co-occurrence for Term Clustering in Language Model-based Sentence Retrieval
Sentence retrieval is a very important part of question answering systems. Term clustering, in turn, is an effective approach for improving sentence retrieval performance: the more similar the terms in each cluster, the better the performance of the retrieval system. A key step in obtaining appropriate word clusters is accurate estimation of pairwise word similarities, based on their tendency t...
A'lam Corpus (پیکره اعلام): A Standard Named-Entity Corpus for the Persian Language
Named entity recognition (NER) is a natural language processing (NLP) problem that is mainly used for text summarization, data mining, data retrieval, question answering, machine translation, and document classification systems. A NER system is tasked with determining the boundaries of each named entity, recognizing its type, and classifying it into predefined categories. The categories of named...